Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 475673 |
| Missing cells | 560206 |
| Missing cells (%) | 7.4% |
| Duplicate rows | 996 |
| Duplicate rows (%) | 0.2% |
| Total size in memory | 288.1 MiB |
| Average record size in memory | 635.1 B |
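The percentage and per-record figures in this table follow from the raw counts. The sketch below recomputes them in plain Python as a sanity check. It assumes the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 1024 × 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 16
n_observations = 475_673
missing_cells = 560_206
duplicate_rows = 996
total_size_mib = 288.1

# Assumption: percentages are relative to all cells / all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The recomputed values round to the 7.4%, 0.2% and 635.1 B shown in the table.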
Variable types
| CAT | 10 |
|---|---|
| NUM | 6 |
Warnings
| MAKE has constant value "Ford" | Constant |
|---|---|
| Dataset has 996 (0.2%) duplicate rows | Duplicates |
| STATE has a high cardinality: 53 distinct values | High cardinality |
| ENGINECYLINDERCOUNT is highly correlated with ENGINEDISPLACEMENT | High correlation |
| ENGINEDISPLACEMENT is highly correlated with ENGINECYLINDERCOUNT | High correlation |
| TRIM has 14955 (3.1%) missing values | Missing |
| TRANSMISSIONTYPE has 6330 (1.3%) missing values | Missing |
| EXTERIORBASECOLOR has 29420 (6.2%) missing values | Missing |
| INTERIORMATERIAL has 142187 (29.9%) missing values | Missing |
| BODYCABSTYLE has 315340 (66.3%) missing values | Missing |
| ODOMETER has 6952 (1.5%) missing values | Missing |
| LISTPRICE has 44089 (9.3%) missing values | Missing |
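The missing-value warnings range from about 1% to about 66% of rows, so they probably call for different handling. The sketch below sorts the flagged variables into rough buckets by missing share. The two thresholds and the bucket labels are hypothetical and are not taken from the report or from pandas-profiling.

```python
N_OBSERVATIONS = 475_673

# Missing-value counts copied from the warnings above.
missing_counts = {
    "TRIM": 14_955,
    "TRANSMISSIONTYPE": 6_330,
    "EXTERIORBASECOLOR": 29_420,
    "INTERIORMATERIAL": 142_187,
    "BODYCABSTYLE": 315_340,
    "ODOMETER": 6_952,
    "LISTPRICE": 44_089,
}

# Hypothetical triage thresholds (assumptions, tune per project).
DROP_ABOVE = 0.50
REVIEW_ABOVE = 0.05


def triage(count, n=N_OBSERVATIONS):
    """Return (missing share, suggested action) for one variable."""
    share = count / n
    if share > DROP_ABOVE:
        return share, "consider dropping"
    if share > REVIEW_ABOVE:
        return share, "review before imputing"
    return share, "impute or ignore"


plan = {name: triage(count) for name, count in missing_counts.items()}

for name, (share, action) in sorted(plan.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:<20} {100 * share:5.1f}%  {action}")
```

Under these assumed thresholds only BODYCABSTYLE lands in the "consider dropping" bucket.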
Reproduction
| Analysis started | 2020-11-18 21:59:52.830549 |
|---|---|
| Analysis finished | 2020-11-18 22:01:33.152504 |
| Duration | 1 minute and 40.32 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
POSTALCODE
Real number (ℝ≥0)
| Distinct | 10017 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 51400.14712 |
|---|---|
| Minimum | 601 |
| Maximum | 99929 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.6 MiB |
Quantile statistics
| Minimum | 601 |
|---|---|
| 5-th percentile | 7735 |
| Q1 | 30097 |
| median | 48838 |
| Q3 | 75751 |
| 95-th percentile | 95136 |
| Maximum | 99929 |
| Range | 99328 |
| Interquartile range (IQR) | 45654 |
Descriptive statistics
| Standard deviation | 26831.5795 |
|---|---|
| Coefficient of variation (CV) | 0.5220136712 |
| Kurtosis | -1.05812849 |
| Mean | 51400.14712 |
| Median Absolute Deviation (MAD) | 21680 |
| Skewness | 0.03901453687 |
| Sum | 2.444966218e+10 |
| Variance | 719933658.6 |
| Monotonicity | Not monotonic |
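Several of the POSTALCODE statistics are derived from others in the same tables. The sketch below recomputes them from the reported mean, standard deviation, quartiles and extremes, using the usual textbook definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1). Small differences are expected because the inputs are already rounded.

```python
# Values copied from the POSTALCODE tables.
n = 475_673
mean = 51400.14712
std = 26831.5795
q1, q3 = 30097, 75751
minimum, maximum = 601, 99929

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum, valid because no values are missing

print(f"CV       = {cv:.10f}")
print(f"Variance = {variance:.1f}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range}")
print(f"Sum      = {total:.9e}")
```

These statistics are arithmetically consistent, but they carry little meaning for a postal code, which is an identifier rather than a quantity.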
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 14580 | 1261 | 0.3% | |
| 92335 | 861 | 0.2% | |
| 43125 | 767 | 0.2% | |
| 95661 | 750 | 0.2% | |
| 19382 | 655 | 0.1% | |
| 66048 | 640 | 0.1% | |
| 27616 | 602 | 0.1% | |
| 23452 | 595 | 0.1% | |
| 37204 | 578 | 0.1% | |
| 65203 | 574 | 0.1% | |
| 43616 | 574 | 0.1% | |
| 49512 | 566 | 0.1% | |
| 33612 | 565 | 0.1% | |
| 77034 | 562 | 0.1% | |
| 44503 | 551 | 0.1% | |
| 85297 | 544 | 0.1% | |
| 43537 | 544 | 0.1% | |
| 33619 | 536 | 0.1% | |
| 55901 | 530 | 0.1% | |
| 49080 | 529 | 0.1% | |
| 44035 | 526 | 0.1% | |
| 77074 | 515 | 0.1% | |
| 79936 | 513 | 0.1% | |
| 32505 | 511 | 0.1% | |
| 73114 | 508 | 0.1% | |
| Other values (9992) | 460316 | 96.8% |
| Value | Count | Frequency (%) | |
| 601 | 6 | < 0.1% | |
| 614 | 61 | < 0.1% | |
| 725 | 1 | < 0.1% | |
| 792 | 1 | < 0.1% | |
| 919 | 4 | < 0.1% | |
| 920 | 3 | < 0.1% | |
| 924 | 6 | < 0.1% | |
| 936 | 2 | < 0.1% | |
| 959 | 4 | < 0.1% | |
| 960 | 3 | < 0.1% |
| Value | Count | Frequency (%) | |
| 99929 | 1 | < 0.1% | |
| 99835 | 43 | < 0.1% | |
| 99801 | 2 | < 0.1% | |
| 99701 | 219 | < 0.1% | |
| 99669 | 24 | < 0.1% | |
| 99654 | 141 | < 0.1% | |
| 99611 | 30 | < 0.1% | |
| 99577 | 1 | < 0.1% | |
| 99518 | 86 | < 0.1% | |
| 99515 | 68 | < 0.1% |
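The smallest values (601, 614, 725, ...) have only three digits, which is what a five-digit code with leading zeros looks like after being parsed as a number. Assuming the column holds US ZIP codes (the report does not say so), the sketch below restores the fixed-width string form. The helper name `to_zip5` is illustrative.

```python
def to_zip5(value):
    """Format a numerically parsed postal code as a 5-character string."""
    return str(int(value)).zfill(5)


# The five smallest values from the table above.
smallest = [601, 614, 725, 792, 919]
print([to_zip5(v) for v in smallest])
```

Storing the column as text would also stop the profiler from reporting means and quantiles for it.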
STATE
Categorical
| Distinct | 53 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.6 MiB |
| Value | Count | Frequency (%) | |
| TX | 47230 | 9.9% | |
| CA | 31711 | 6.7% | |
| FL | 25385 | 5.3% | |
| OH | 25224 | 5.3% | |
| PA | 21867 | 4.6% | |
| MI | 21687 | 4.6% | |
| IL | 19057 | 4.0% | |
| NC | 16525 | 3.5% | |
| NY | 16452 | 3.5% | |
| GA | 14818 | 3.1% | |
| IN | 13872 | 2.9% | |
| WI | 13409 | 2.8% | |
| VA | 13130 | 2.8% | |
| MO | 12790 | 2.7% | |
| TN | 11750 | 2.5% | |
| MN | 11496 | 2.4% | |
| WA | 9686 | 2.0% | |
| NJ | 9430 | 2.0% | |
| AZ | 8899 | 1.9% | |
| CO | 8597 | 1.8% | |
| OK | 8202 | 1.7% | |
| MA | 7974 | 1.7% | |
| IA | 7570 | 1.6% | |
| KY | 7538 | 1.6% | |
| MD | 7350 | 1.5% | |
| Other values (28) | 84024 | 17.7% |
Frequencies of value counts
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Most occurring characters
| Value | Count | Frequency (%) | |
| A | 133204 | 14.0% | |
| N | 94523 | 9.9% | |
| I | 80465 | 8.5% | |
| M | 73081 | 7.7% | |
| T | 71638 | 7.5% | |
| C | 67160 | 7.1% | |
| O | 60727 | 6.4% | |
| L | 57171 | 6.0% | |
| X | 47230 | 5.0% | |
| H | 28759 | 3.0% | |
| W | 27255 | 2.9% | |
| F | 25385 | 2.7% | |
| Y | 24964 | 2.6% | |
| K | 23027 | 2.4% | |
| P | 21960 | 2.3% | |
| V | 20278 | 2.1% | |
| S | 19557 | 2.1% | |
| D | 17201 | 1.8% | |
| G | 14837 | 1.6% | |
| R | 10866 | 1.1% | |
| J | 9430 | 1.0% | |
| Z | 8899 | 0.9% | |
| E | 8386 | 0.9% | |
| U | 5343 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Uppercase Letter | 951346 | 100.0% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| A | 133204 | 14.0% | |
| N | 94523 | 9.9% | |
| I | 80465 | 8.5% | |
| M | 73081 | 7.7% | |
| T | 71638 | 7.5% | |
| C | 67160 | 7.1% | |
| O | 60727 | 6.4% | |
| L | 57171 | 6.0% | |
| X | 47230 | 5.0% | |
| H | 28759 | 3.0% | |
| W | 27255 | 2.9% | |
| F | 25385 | 2.7% | |
| Y | 24964 | 2.6% | |
| K | 23027 | 2.4% | |
| P | 21960 | 2.3% | |
| V | 20278 | 2.1% | |
| S | 19557 | 2.1% | |
| D | 17201 | 1.8% | |
| G | 14837 | 1.6% | |
| R | 10866 | 1.1% | |
| J | 9430 | 1.0% | |
| Z | 8899 | 0.9% | |
| E | 8386 | 0.9% | |
| U | 5343 | 0.6% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 951346 | 100.0% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| A | 133204 | 14.0% | |
| N | 94523 | 9.9% | |
| I | 80465 | 8.5% | |
| M | 73081 | 7.7% | |
| T | 71638 | 7.5% | |
| C | 67160 | 7.1% | |
| O | 60727 | 6.4% | |
| L | 57171 | 6.0% | |
| X | 47230 | 5.0% | |
| H | 28759 | 3.0% | |
| W | 27255 | 2.9% | |
| F | 25385 | 2.7% | |
| Y | 24964 | 2.6% | |
| K | 23027 | 2.4% | |
| P | 21960 | 2.3% | |
| V | 20278 | 2.1% | |
| S | 19557 | 2.1% | |
| D | 17201 | 1.8% | |
| G | 14837 | 1.6% | |
| R | 10866 | 1.1% | |
| J | 9430 | 1.0% | |
| Z | 8899 | 0.9% | |
| E | 8386 | 0.9% | |
| U | 5343 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 951346 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| A | 133204 | 14.0% | |
| N | 94523 | 9.9% | |
| I | 80465 | 8.5% | |
| M | 73081 | 7.7% | |
| T | 71638 | 7.5% | |
| C | 67160 | 7.1% | |
| O | 60727 | 6.4% | |
| L | 57171 | 6.0% | |
| X | 47230 | 5.0% | |
| H | 28759 | 3.0% | |
| W | 27255 | 2.9% | |
| F | 25385 | 2.7% | |
| Y | 24964 | 2.6% | |
| K | 23027 | 2.4% | |
| P | 21960 | 2.3% | |
| V | 20278 | 2.1% | |
| S | 19557 | 2.1% | |
| D | 17201 | 1.8% | |
| G | 14837 | 1.6% | |
| R | 10866 | 1.1% | |
| J | 9430 | 1.0% | |
| Z | 8899 | 0.9% | |
| E | 8386 | 0.9% | |
| U | 5343 | 0.6% |
MODELYEAR
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2016.030101 |
|---|---|
| Minimum | 2013 |
| Maximum | 2018 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.6 MiB |
Quantile statistics
| Minimum | 2013 |
|---|---|
| 5-th percentile | 2013 |
| Q1 | 2015 |
| median | 2017 |
| Q3 | 2017 |
| 95-th percentile | 2018 |
| Maximum | 2018 |
| Range | 5 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.636203544 |
|---|---|
| Coefficient of variation (CV) | 0.0008115967831 |
| Kurtosis | -0.9421869309 |
| Mean | 2016.030101 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.5488263564 |
| Sum | 958971086 |
| Variance | 2.677162038 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=6)
| Value | Count | Frequency (%) | |
| 2017 | 148969 | 31.3% | |
| 2018 | 93641 | 19.7% | |
| 2016 | 72096 | 15.2% | |
| 2014 | 54482 | 11.5% | |
| 2015 | 53243 | 11.2% | |
| 2013 | 53242 | 11.2% |
| Value | Count | Frequency (%) | |
| 2013 | 53242 | 11.2% | |
| 2014 | 54482 | 11.5% | |
| 2015 | 53243 | 11.2% | |
| 2016 | 72096 | 15.2% | |
| 2017 | 148969 | 31.3% | |
| 2018 | 93641 | 19.7% |
| Value | Count | Frequency (%) | |
| 2018 | 93641 | 19.7% | |
| 2017 | 148969 | 31.3% | |
| 2016 | 72096 | 15.2% | |
| 2015 | 53243 | 11.2% | |
| 2014 | 54482 | 11.5% | |
| 2013 | 53242 | 11.2% |
MAKE
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.6 MiB |
| Value | Count | Frequency (%) | |
| Ford | 475673 | 100.0% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Most occurring characters
| Value | Count | Frequency (%) | |
| F | 475673 | 25.0% | |
| o | 475673 | 25.0% | |
| r | 475673 | 25.0% | |
| d | 475673 | 25.0% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 1427019 | 75.0% | |
| Uppercase Letter | 475673 | 25.0% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| F | 475673 | 100.0% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| o | 475673 | 33.3% | |
| r | 475673 | 33.3% | |
| d | 475673 | 33.3% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 1902692 | 100.0% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| F | 475673 | 25.0% | |
| o | 475673 | 25.0% | |
| r | 475673 | 25.0% | |
| d | 475673 | 25.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 1902692 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| F | 475673 | 25.0% | |
| o | 475673 | 25.0% | |
| r | 475673 | 25.0% | |
| d | 475673 | 25.0% |
MODEL
Categorical
| Distinct | 41 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.6 MiB |
| Value | Count | Frequency (%) | |
| F-150 | 113321 | 23.8% | |
| Escape | 73891 | 15.5% | |
| Explorer | 56713 | 11.9% | |
| Fusion | 43090 | 9.1% | |
| Focus | 33894 | 7.1% | |
| Edge | 33549 | 7.1% | |
| Mustang | 20452 | 4.3% | |
| F-250SD | 17934 | 3.8% | |
| F-350SD | 10823 | 2.3% | |
| Fiesta | 10250 | 2.2% | |
| Taurus | 8266 | 1.7% | |
| Expedition | 6207 | 1.3% | |
| Fusion Hybrid | 6000 | 1.3% | |
| Flex | 5672 | 1.2% | |
| Transit Connect | 4850 | 1.0% | |
| Transit-350 | 4332 | 0.9% | |
| Expedition EL | 3739 | 0.8% | |
| EcoSport | 3164 | 0.7% | |
| C-Max Hybrid | 2822 | 0.6% | |
| Transit-250 | 2652 | 0.6% | |
| Fusion Energi | 2533 | 0.5% | |
| Transit-150 | 2210 | 0.5% | |
| C-Max Energi | 2033 | 0.4% | |
| E-350SD | 1322 | 0.3% | |
| F-550SD | 1033 | 0.2% | |
| Other values (16) | 4921 | 1.0% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 26 |
|---|---|
| Median length | 6 |
| Mean length | 6.422460388 |
| Min length | 2 |
Most occurring characters
| Value | Count | Frequency (%) | |
| F | 245865 | 8.0% | |
| s | 212796 | 7.0% | |
| e | 203399 | 6.7% | |
| E | 189919 | 6.2% | |
| o | 166230 | 5.4% | |
| - | 161284 | 5.3% | |
| 5 | 157462 | 5.2% | |
| 0 | 156421 | 5.1% | |
| r | 154200 | 5.0% | |
| p | 145582 | 4.8% | |
| a | 133058 | 4.4% | |
| u | 122773 | 4.0% | |
| c | 118211 | 3.9% | |
| 1 | 116116 | 3.8% | |
| i | 113419 | 3.7% | |
| n | 112381 | 3.7% | |
| x | 79182 | 2.6% | |
| t | 66684 | 2.2% | |
| l | 64001 | 2.1% | |
| g | 58573 | 1.9% | |
| d | 53601 | 1.8% | |
| S | 35902 | 1.2% | |
| D | 32452 | 1.1% | |
| M | 26305 | 0.9% | |
| (space) | 24877 | 0.8% | |
| Other values (20) | 104298 | 3.4% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 1822334 | 59.7% | |
| Uppercase Letter | 577217 | 18.9% | |
| Decimal Number | 469279 | 15.4% | |
| Dash Punctuation | 161284 | 5.3% | |
| Space Separator | 24877 | 0.8% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| F | 245865 | 42.6% | |
| E | 189919 | 32.9% | |
| S | 35902 | 6.2% | |
| D | 32452 | 5.6% | |
| M | 26305 | 4.6% | |
| T | 22323 | 3.9% | |
| C | 9747 | 1.7% | |
| H | 8822 | 1.5% | |
| L | 3739 | 0.6% | |
| P | 794 | 0.1% | |
| I | 794 | 0.1% | |
| U | 542 | 0.1% | |
| R | 6 | < 0.1% | |
| G | 5 | < 0.1% | |
| A | 2 | < 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| s | 212796 | 11.7% | |
| e | 203399 | 11.2% | |
| o | 166230 | 9.1% | |
| r | 154200 | 8.5% | |
| p | 145582 | 8.0% | |
| a | 133058 | 7.3% | |
| u | 122773 | 6.7% | |
| c | 118211 | 6.5% | |
| i | 113419 | 6.2% | |
| n | 112381 | 6.2% | |
| x | 79182 | 4.3% | |
| t | 66684 | 3.7% | |
| l | 64001 | 3.5% | |
| g | 58573 | 3.2% | |
| d | 53601 | 2.9% | |
| y | 9364 | 0.5% | |
| b | 8822 | 0.5% | |
| h | 34 | < 0.1% | |
| m | 24 | < 0.1% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 161284 | 100.0% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 24877 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 5 | 157462 | 33.6% | |
| 0 | 156421 | 33.3% | |
| 1 | 116116 | 24.7% | |
| 2 | 21455 | 4.6% | |
| 3 | 16477 | 3.5% | |
| 4 | 1181 | 0.3% | |
| 6 | 91 | < 0.1% | |
| 7 | 68 | < 0.1% | |
| 9 | 8 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 2399551 | 78.5% | |
| Common | 655440 | 21.5% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| F | 245865 | 10.2% | |
| s | 212796 | 8.9% | |
| e | 203399 | 8.5% | |
| E | 189919 | 7.9% | |
| o | 166230 | 6.9% | |
| r | 154200 | 6.4% | |
| p | 145582 | 6.1% | |
| a | 133058 | 5.5% | |
| u | 122773 | 5.1% | |
| c | 118211 | 4.9% | |
| i | 113419 | 4.7% | |
| n | 112381 | 4.7% | |
| x | 79182 | 3.3% | |
| t | 66684 | 2.8% | |
| l | 64001 | 2.7% | |
| g | 58573 | 2.4% | |
| d | 53601 | 2.2% | |
| S | 35902 | 1.5% | |
| D | 32452 | 1.4% | |
| M | 26305 | 1.1% | |
| T | 22323 | 0.9% | |
| C | 9747 | 0.4% | |
| y | 9364 | 0.4% | |
| H | 8822 | 0.4% | |
| b | 8822 | 0.4% | |
| Other values (9) | 5940 | 0.2% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| - | 161284 | 24.6% | |
| 5 | 157462 | 24.0% | |
| 0 | 156421 | 23.9% | |
| 1 | 116116 | 17.7% | |
| (space) | 24877 | 3.8% | |
| 2 | 21455 | 3.3% | |
| 3 | 16477 | 2.5% | |
| 4 | 1181 | 0.2% | |
| 6 | 91 | < 0.1% | |
| 7 | 68 | < 0.1% | |
| 9 | 8 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 3054991 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| F | 245865 | 8.0% | |
| s | 212796 | 7.0% | |
| e | 203399 | 6.7% | |
| E | 189919 | 6.2% | |
| o | 166230 | 5.4% | |
| - | 161284 | 5.3% | |
| 5 | 157462 | 5.2% | |
| 0 | 156421 | 5.1% | |
| r | 154200 | 5.0% | |
| p | 145582 | 4.8% | |
| a | 133058 | 4.4% | |
| u | 122773 | 4.0% | |
| c | 118211 | 3.9% | |
| 1 | 116116 | 3.8% | |
| i | 113419 | 3.7% | |
| n | 112381 | 3.7% | |
| x | 79182 | 2.6% | |
| t | 66684 | 2.2% | |
| l | 64001 | 2.1% | |
| g | 58573 | 1.9% | |
| d | 53601 | 1.8% | |
| S | 35902 | 1.2% | |
| D | 32452 | 1.1% | |
| M | 26305 | 0.9% | |
| (space) | 24877 | 0.8% | |
| Other values (20) | 104298 | 3.4% |
TRIM
Categorical
| Distinct | 39 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 14955 |
| Missing (%) | 3.1% |
| Memory size | 3.6 MiB |
| Value | Count | Frequency (%) | |
| SE | 123488 | 26.0% | |
| XLT | 101190 | 21.3% | |
| Titanium | 36849 | 7.7% | |
| SEL | 33832 | 7.1% | |
| XL | 32738 | 6.9% | |
| Lariat | 23518 | 4.9% | |
| Limited | 23184 | 4.9% | |
| S | 13954 | 2.9% | |
| Base | 11287 | 2.4% | |
| Platinum | 10734 | 2.3% | |
| Sport | 10354 | 2.2% | |
| King Ranch | 4789 | 1.0% | |
| GT | 4500 | 0.9% | |
| V6 | 4349 | 0.9% | |
| EcoBoost | 3134 | 0.7% | |
| GT Premium | 3128 | 0.7% | |
| ST | 2996 | 0.6% | |
| EcoBoost Premium | 2878 | 0.6% | |
| FX4 | 2846 | 0.6% | |
| Raptor | 2307 | 0.5% | |
| STX | 1707 | 0.4% | |
| SE Luxury | 1401 | 0.3% | |
| Shelby GT350 | 854 | 0.2% | |
| SVT Raptor | 840 | 0.2% | |
| SHO | 784 | 0.2% | |
| Other values (14) | 3077 | 0.6% | |
| (Missing) | 14955 | 3.1% |
Frequencies of value counts
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 27 |
|---|---|
| Median length | 3 |
| Mean length | 3.81406975 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| L | 215964 | 11.9% | |
| S | 192037 | 10.6% | |
| i | 166629 | 9.2% | |
| E | 165359 | 9.1% | |
| T | 152404 | 8.4% | |
| X | 138876 | 7.7% | |
| a | 129484 | 7.1% | |
| t | 113931 | 6.3% | |
| n | 87137 | 4.8% | |
| m | 85349 | 4.7% | |
| u | 57023 | 3.1% | |
| r | 45744 | 2.5% | |
| e | 42958 | 2.4% | |
| o | 32332 | 1.8% | |
| d | 23316 | 1.3% | |
| s | 17583 | 1.0% | |
| B | 17408 | 1.0% | |
| P | 17372 | 1.0% | |
| (space) | 15136 | 0.8% | |
| p | 13501 | 0.7% | |
| l | 12452 | 0.7% | |
| c | 11422 | 0.6% | |
| G | 8788 | 0.5% | |
| R | 8419 | 0.5% | |
| h | 5885 | 0.3% | |
| Other values (17) | 37741 | 2.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Uppercase Letter | 932685 | 51.4% | |
| Lowercase Letter | 854529 | 47.1% | |
| Space Separator | 15136 | 0.8% | |
| Decimal Number | 11900 | 0.7% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| i | 166629 | 19.5% | |
| a | 129484 | 15.2% | |
| t | 113931 | 13.3% | |
| n | 87137 | 10.2% | |
| m | 85349 | 10.0% | |
| u | 57023 | 6.7% | |
| r | 45744 | 5.4% | |
| e | 42958 | 5.0% | |
| o | 32332 | 3.8% | |
| d | 23316 | 2.7% | |
| s | 17583 | 2.1% | |
| p | 13501 | 1.6% | |
| l | 12452 | 1.5% | |
| c | 11422 | 1.3% | |
| h | 5885 | 0.7% | |
| g | 4789 | 0.6% | |
| y | 2497 | 0.3% | |
| x | 1401 | 0.2% | |
| b | 1096 | 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| L | 215964 | 23.2% | |
| S | 192037 | 20.6% | |
| E | 165359 | 17.7% | |
| T | 152404 | 16.3% | |
| X | 138876 | 14.9% | |
| B | 17408 | 1.9% | |
| P | 17372 | 1.9% | |
| G | 8788 | 0.9% | |
| R | 8419 | 0.9% | |
| V | 5806 | 0.6% | |
| K | 4789 | 0.5% | |
| F | 3208 | 0.3% | |
| H | 785 | 0.1% | |
| O | 784 | 0.1% | |
| C | 620 | 0.1% | |
| Y | 66 | < 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 15136 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 6 | 4948 | 41.6% | |
| 4 | 2846 | 23.9% | |
| 0 | 1511 | 12.7% | |
| 5 | 1161 | 9.8% | |
| 3 | 963 | 8.1% | |
| 2 | 471 | 4.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 1787214 | 98.5% | |
| Common | 27036 | 1.5% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| L | 215964 | 12.1% | |
| S | 192037 | 10.7% | |
| i | 166629 | 9.3% | |
| E | 165359 | 9.3% | |
| T | 152404 | 8.5% | |
| X | 138876 | 7.8% | |
| a | 129484 | 7.2% | |
| t | 113931 | 6.4% | |
| n | 87137 | 4.9% | |
| m | 85349 | 4.8% | |
| u | 57023 | 3.2% | |
| r | 45744 | 2.6% | |
| e | 42958 | 2.4% | |
| o | 32332 | 1.8% | |
| d | 23316 | 1.3% | |
| s | 17583 | 1.0% | |
| B | 17408 | 1.0% | |
| P | 17372 | 1.0% | |
| p | 13501 | 0.8% | |
| l | 12452 | 0.7% | |
| c | 11422 | 0.6% | |
| G | 8788 | 0.5% | |
| R | 8419 | 0.5% | |
| h | 5885 | 0.3% | |
| V | 5806 | 0.3% | |
| Other values (10) | 20035 | 1.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 15136 | 56.0% | |
| 6 | 4948 | 18.3% | |
| 4 | 2846 | 10.5% | |
| 0 | 1511 | 5.6% | |
| 5 | 1161 | 4.3% | |
| 3 | 963 | 3.6% | |
| 2 | 471 | 1.7% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 1814250 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| L | 215964 | 11.9% | |
| S | 192037 | 10.6% | |
| i | 166629 | 9.2% | |
| E | 165359 | 9.1% | |
| T | 152404 | 8.4% | |
| X | 138876 | 7.7% | |
| a | 129484 | 7.1% | |
| t | 113931 | 6.3% | |
| n | 87137 | 4.8% | |
| m | 85349 | 4.7% | |
| u | 57023 | 3.1% | |
| r | 45744 | 2.5% | |
| e | 42958 | 2.4% | |
| o | 32332 | 1.8% | |
| d | 23316 | 1.3% | |
| s | 17583 | 1.0% | |
| B | 17408 | 1.0% | |
| P | 17372 | 1.0% | |
| (space) | 15136 | 0.8% | |
| p | 13501 | 0.7% | |
| l | 12452 | 0.7% | |
| c | 11422 | 0.6% | |
| G | 8788 | 0.5% | |
| R | 8419 | 0.5% | |
| h | 5885 | 0.3% | |
| Other values (17) | 37741 | 2.1% |
ENGINEDISPLACEMENT
Real number (ℝ≥0)
| Distinct | 22 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 431 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.110341047 |
|---|---|
| Minimum | 1 |
| Maximum | 6.8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1.5 |
| Q1 | 2 |
| median | 2.7 |
| Q3 | 3.5 |
| 95-th percentile | 6.2 |
| Maximum | 6.8 |
| Range | 5.8 |
| Interquartile range (IQR) | 1.5 |
Descriptive statistics
| Standard deviation | 1.390881585 |
|---|---|
| Coefficient of variation (CV) | 0.447179767 |
| Kurtosis | 0.3083342061 |
| Mean | 3.110341047 |
| Median Absolute Deviation (MAD) | 0.8 |
| Skewness | 0.9529910523 |
| Sum | 1478164.7 |
| Variance | 1.934551583 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=22)
| Value | Count | Frequency (%) | |
| 3.5 | 132728 | 27.9% | |
| 2 | 96437 | 20.3% | |
| 1.5 | 44784 | 9.4% | |
| 5 | 43161 | 9.1% | |
| 2.7 | 31133 | 6.5% | |
| 2.5 | 27419 | 5.8% | |
| 1.6 | 26538 | 5.6% | |
| 6.7 | 20444 | 4.3% | |
| 3.7 | 15741 | 3.3% | |
| 2.3 | 13505 | 2.8% | |
| 6.2 | 10507 | 2.2% | |
| 5.4 | 3437 | 0.7% | |
| 1 | 2919 | 0.6% | |
| 3.3 | 1786 | 0.4% | |
| 4.6 | 1249 | 0.3% | |
| 5.2 | 856 | 0.2% | |
| 6 | 770 | 0.2% | |
| 6.8 | 752 | 0.2% | |
| 3 | 498 | 0.1% | |
| 3.2 | 337 | 0.1% | |
| 5.8 | 240 | 0.1% | |
| 5.7 | 1 | < 0.1% | |
| (Missing) | 431 | 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 2919 | 0.6% | |
| 1.5 | 44784 | 9.4% | |
| 1.6 | 26538 | 5.6% | |
| 2 | 96437 | 20.3% | |
| 2.3 | 13505 | 2.8% | |
| 2.5 | 27419 | 5.8% | |
| 2.7 | 31133 | 6.5% | |
| 3 | 498 | 0.1% | |
| 3.2 | 337 | 0.1% | |
| 3.3 | 1786 | 0.4% |
| Value | Count | Frequency (%) | |
| 6.8 | 752 | 0.2% | |
| 6.7 | 20444 | 4.3% | |
| 6.2 | 10507 | 2.2% | |
| 6 | 770 | 0.2% | |
| 5.8 | 240 | 0.1% | |
| 5.7 | 1 | < 0.1% | |
| 5.4 | 3437 | 0.7% | |
| 5.2 | 856 | 0.2% | |
| 5 | 43161 | 9.1% | |
| 4.6 | 1249 | 0.3% |
ENGINECYLINDERCOUNT
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 367 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.449142237 |
|---|---|
| Minimum | 3 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.6 MiB |
Quantile statistics
| Minimum | 3 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 4 |
| median | 6 |
| Q3 | 6 |
| 95-th percentile | 8 |
| Maximum | 10 |
| Range | 7 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.489566789 |
|---|---|
| Coefficient of variation (CV) | 0.2733580303 |
| Kurtosis | -0.9200644719 |
| Mean | 5.449142237 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.4960406441 |
| Sum | 2590010 |
| Variance | 2.218809218 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=6)
| Value | Count | Frequency (%) | |
| 4 | 208732 | 43.9% | |
| 6 | 181932 | 38.2% | |
| 8 | 80406 | 16.9% | |
| 3 | 2919 | 0.6% | |
| 10 | 980 | 0.2% | |
| 5 | 337 | 0.1% | |
| (Missing) | 367 | 0.1% |
| Value | Count | Frequency (%) | |
| 3 | 2919 | 0.6% | |
| 4 | 208732 | 43.9% | |
| 5 | 337 | 0.1% | |
| 6 | 181932 | 38.2% | |
| 8 | 80406 | 16.9% | |
| 10 | 980 | 0.2% |
| Value | Count | Frequency (%) | |
| 10 | 980 | 0.2% | |
| 8 | 80406 | 16.9% | |
| 6 | 181932 | 38.2% | |
| 5 | 337 | 0.1% | |
| 4 | 208732 | 43.9% | |
| 3 | 2919 | 0.6% |
TRANSMISSIONTYPE
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 6330 |
| Missing (%) | 1.3% |
| Memory size | 3.6 MiB |
| Value | Count | Frequency (%) | |
| Automatic | 440410 | 92.6% | |
| Manual | 15545 | 3.3% | |
| CVT | 13388 | 2.8% | |
| (Missing) | 6330 | 1.3% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 8.653242879 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| t | 880820 | 21.4% | |
| a | 477830 | 11.6% | |
| u | 455955 | 11.1% | |
| A | 440410 | 10.7% | |
| o | 440410 | 10.7% | |
| m | 440410 | 10.7% | |
| i | 440410 | 10.7% | |
| c | 440410 | 10.7% | |
| n | 28205 | 0.7% | |
| M | 15545 | 0.4% | |
| l | 15545 | 0.4% | |
| C | 13388 | 0.3% | |
| V | 13388 | 0.3% | |
| T | 13388 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 3619995 | 87.9% | |
| Uppercase Letter | 496119 | 12.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| A | 440410 | 88.8% | |
| M | 15545 | 3.1% | |
| C | 13388 | 2.7% | |
| V | 13388 | 2.7% | |
| T | 13388 | 2.7% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| t | 880820 | 24.3% | |
| a | 477830 | 13.2% | |
| u | 455955 | 12.6% | |
| o | 440410 | 12.2% | |
| m | 440410 | 12.2% | |
| i | 440410 | 12.2% | |
| c | 440410 | 12.2% | |
| n | 28205 | 0.8% | |
| l | 15545 | 0.4% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 4116114 | 100.0% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| t | 880820 | 21.4% | |
| a | 477830 | 11.6% | |
| u | 455955 | 11.1% | |
| A | 440410 | 10.7% | |
| o | 440410 | 10.7% | |
| m | 440410 | 10.7% | |
| i | 440410 | 10.7% | |
| c | 440410 | 10.7% | |
| n | 28205 | 0.7% | |
| M | 15545 | 0.4% | |
| l | 15545 | 0.4% | |
| C | 13388 | 0.3% | |
| V | 13388 | 0.3% | |
| T | 13388 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 4116114 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| t | 880820 | 21.4% | |
| a | 477830 | 11.6% | |
| u | 455955 | 11.1% | |
| A | 440410 | 10.7% | |
| o | 440410 | 10.7% | |
| m | 440410 | 10.7% | |
| i | 440410 | 10.7% | |
| c | 440410 | 10.7% | |
| n | 28205 | 0.7% | |
| M | 15545 | 0.4% | |
| l | 15545 | 0.4% | |
| C | 13388 | 0.3% | |
| V | 13388 | 0.3% | |
| T | 13388 | 0.3% |
DRIVETRAINTYPE
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 85 |
| Missing (%) | < 0.1% |
| Memory size | 3.6 MiB |
| Value | Count | Frequency (%) | |
| 4WD | 190584 | 40.1% | |
| FWD | 180884 | 38.0% | |
| RWD | 60159 | 12.6% | |
| AWD | 43961 | 9.2% | |
| (Missing) | 85 | < 0.1% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| W | 475588 | 33.3% | |
| D | 475588 | 33.3% | |
| 4 | 190584 | 13.4% | |
| F | 180884 | 12.7% | |
| R | 60159 | 4.2% | |
| A | 43961 | 3.1% | |
| n | 170 | < 0.1% | |
| a | 85 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Uppercase Letter | 1236180 | 86.6% | |
| Decimal Number | 190584 | 13.4% | |
| Lowercase Letter | 255 | < 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| W | 475588 | 38.5% | |
| D | 475588 | 38.5% | |
| F | 180884 | 14.6% | |
| R | 60159 | 4.9% | |
| A | 43961 | 3.6% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| n | 170 | 66.7% | |
| a | 85 | 33.3% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 4 | 190584 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 1236435 | 86.6% | |
| Common | 190584 | 13.4% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| W | 475588 | 38.5% | |
| D | 475588 | 38.5% | |
| F | 180884 | 14.6% | |
| R | 60159 | 4.9% | |
| A | 43961 | 3.6% | |
| n | 170 | < 0.1% | |
| a | 85 | < 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 4 | 190584 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 1427019 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| W | 475588 | 33.3% | |
| D | 475588 | 33.3% | |
| 4 | 190584 | 13.4% | |
| F | 180884 | 12.7% | |
| R | 60159 | 4.2% | |
| A | 43961 | 3.1% | |
| n | 170 | < 0.1% | |
| a | 85 | < 0.1% |
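The lowercase `n` (170) and `a` (85) in the character tables do not occur in any of the four drivetrain labels. Their counts match what you would get if each of the 85 missing values had been converted to the string "nan" before characters were counted. That explanation is an inference from the numbers, not something the report states. The sketch below rebuilds the character counts under that assumption.

```python
from collections import Counter

# Label counts copied from the DRIVETRAINTYPE frequency table.
label_counts = {"4WD": 190_584, "FWD": 180_884, "RWD": 60_159, "AWD": 43_961}
n_missing = 85

char_counts = Counter()
for label, count in label_counts.items():
    for ch in label:
        char_counts[ch] += count

# Assumption: every missing value was stringified as "nan".
for ch in "nan":
    char_counts[ch] += n_missing

total_chars = sum(char_counts.values())
for ch, count in char_counts.most_common():
    print(f"{ch!r}: {count} ({100 * count / total_chars:.1f}%)")
print("total:", total_chars)
```

Under this assumption the total comes to 1427019 characters, the same figure the report lists for the ASCII block.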
EXTERIORBASECOLOR
Categorical
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 29420 |
| Missing (%) | 6.2% |
| Memory size | 3.6 MiB |
| Value | Count | Frequency (%) | |
| White | 116846 | 24.6% | |
| Black | 92442 | 19.4% | |
| Gray | 68472 | 14.4% | |
| Silver | 53888 | 11.3% | |
| Red | 52433 | 11.0% | |
| Blue | 36862 | 7.7% | |
| Gold | 9939 | 2.1% | |
| Brown | 6466 | 1.4% | |
| Green | 3255 | 0.7% | |
| Orange | 2695 | 0.6% | |
| Beige | 1901 | 0.4% | |
| Yellow | 943 | 0.2% | |
| Purple | 111 | < 0.1% | |
| (Missing) | 29420 | 6.2% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 6 |
|---|---|
| Median length | 5 |
| Mean length | 4.534676133 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| e | 274090 | 12.7% | |
| l | 195128 | 9.0% | |
| a | 193029 | 8.9% | |
| i | 172635 | 8.0% | |
| B | 137671 | 6.4% | |
| r | 134887 | 6.3% | |
| W | 116846 | 5.4% | |
| h | 116846 | 5.4% | |
| t | 116846 | 5.4% | |
| c | 92442 | 4.3% | |
| k | 92442 | 4.3% | |
| G | 81666 | 3.8% | |
| n | 71256 | 3.3% | |
| y | 68472 | 3.2% | |
| d | 62372 | 2.9% | |
| S | 53888 | 2.5% | |
| v | 53888 | 2.5% | |
| R | 52433 | 2.4% | |
| u | 36973 | 1.7% | |
| o | 17348 | 0.8% | |
| w | 7409 | 0.3% | |
| g | 4596 | 0.2% | |
| O | 2695 | 0.1% | |
| Y | 943 | < 0.1% | |
| P | 111 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 1710770 | 79.3% | |
| Uppercase Letter | 446253 | 20.7% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| B | 137671 | 30.9% | |
| W | 116846 | 26.2% | |
| G | 81666 | 18.3% | |
| S | 53888 | 12.1% | |
| R | 52433 | 11.7% | |
| O | 2695 | 0.6% | |
| Y | 943 | 0.2% | |
| P | 111 | < 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| e | 274090 | 16.0% | |
| l | 195128 | 11.4% | |
| a | 193029 | 11.3% | |
| i | 172635 | 10.1% | |
| r | 134887 | 7.9% | |
| h | 116846 | 6.8% | |
| t | 116846 | 6.8% | |
| c | 92442 | 5.4% | |
| k | 92442 | 5.4% | |
| n | 71256 | 4.2% | |
| y | 68472 | 4.0% | |
| d | 62372 | 3.6% | |
| v | 53888 | 3.1% | |
| u | 36973 | 2.2% | |
| o | 17348 | 1.0% | |
| w | 7409 | 0.4% | |
| g | 4596 | 0.3% | |
| p | 111 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 2157023 | 100.0% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| e | 274090 | 12.7% | |
| l | 195128 | 9.0% | |
| a | 193029 | 8.9% | |
| i | 172635 | 8.0% | |
| B | 137671 | 6.4% | |
| r | 134887 | 6.3% | |
| W | 116846 | 5.4% | |
| h | 116846 | 5.4% | |
| t | 116846 | 5.4% | |
| c | 92442 | 4.3% | |
| k | 92442 | 4.3% | |
| G | 81666 | 3.8% | |
| n | 71256 | 3.3% | |
| y | 68472 | 3.2% | |
| d | 62372 | 2.9% | |
| S | 53888 | 2.5% | |
| v | 53888 | 2.5% | |
| R | 52433 | 2.4% | |
| u | 36973 | 1.7% | |
| o | 17348 | 0.8% | |
| w | 7409 | 0.3% | |
| g | 4596 | 0.2% | |
| O | 2695 | 0.1% | |
| Y | 943 | < 0.1% | |
| P | 111 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 2157023 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| e | 274090 | 12.7% | |
| l | 195128 | 9.0% | |
| a | 193029 | 8.9% | |
| i | 172635 | 8.0% | |
| B | 137671 | 6.4% | |
| r | 134887 | 6.3% | |
| W | 116846 | 5.4% | |
| h | 116846 | 5.4% | |
| t | 116846 | 5.4% | |
| c | 92442 | 4.3% | |
| k | 92442 | 4.3% | |
| G | 81666 | 3.8% | |
| n | 71256 | 3.3% | |
| y | 68472 | 3.2% | |
| d | 62372 | 2.9% | |
| S | 53888 | 2.5% | |
| v | 53888 | 2.5% | |
| R | 52433 | 2.4% | |
| u | 36973 | 1.7% | |
| o | 17348 | 0.8% | |
| w | 7409 | 0.3% | |
| g | 4596 | 0.2% | |
| O | 2695 | 0.1% | |
| Y | 943 | < 0.1% | |
| P | 111 | < 0.1% |
INTERIORMATERIAL
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 142187 |
| Missing (%) | 29.9% |
| Memory size | 3.6 MiB |
Bar chart of the most frequent values (counts are listed in the table below).
| Value | Count | Frequency (%) | |
| Cloth | 184114 | 38.7% | |
| Leather | 135521 | 28.5% | |
| Vinyl | 12633 | 2.7% | |
| Artificial Leather | 1202 | 0.3% | |
| cloth | 14 | < 0.1% | |
| 43625 | 1 | < 0.1% | |
| 8725 | 1 | < 0.1% | |
| (Missing) | 142187 | 29.9% |
Frequencies of value counts
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 18 |
|---|---|
| Median length | 5 |
| Mean length | 5.004820538 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| t | 322053 | 13.5% | |
| h | 320851 | 13.5% | |
| n | 297007 | 12.5% | |
| a | 280112 | 11.8% | |
| e | 273446 | 11.5% | |
| l | 197963 | 8.3% | |
| o | 184128 | 7.7% | |
| C | 184114 | 7.7% | |
| r | 137925 | 5.8% | |
| L | 136723 | 5.7% | |
| i | 16239 | 0.7% | |
| V | 12633 | 0.5% | |
| y | 12633 | 0.5% | |
| c | 1216 | 0.1% | |
| A | 1202 | 0.1% | |
| f | 1202 | 0.1% | |
| (space) | 1202 | 0.1% | |
| 2 | 2 | < 0.1% | |
| 5 | 2 | < 0.1% | |
| 8 | 1 | < 0.1% | |
| 7 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 2044775 | 85.9% | |
| Uppercase Letter | 334672 | 14.1% | |
| Space Separator | 1202 | 0.1% | |
| Decimal Number | 9 | < 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| C | 184114 | 55.0% | |
| L | 136723 | 40.9% | |
| V | 12633 | 3.8% | |
| A | 1202 | 0.4% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| t | 322053 | 15.8% | |
| h | 320851 | 15.7% | |
| n | 297007 | 14.5% | |
| a | 280112 | 13.7% | |
| e | 273446 | 13.4% | |
| l | 197963 | 9.7% | |
| o | 184128 | 9.0% | |
| r | 137925 | 6.7% | |
| i | 16239 | 0.8% | |
| y | 12633 | 0.6% | |
| c | 1216 | 0.1% | |
| f | 1202 | 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 1202 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 2 | 2 | 22.2% | |
| 5 | 2 | 22.2% | |
| 8 | 1 | 11.1% | |
| 7 | 1 | 11.1% | |
| 4 | 1 | 11.1% | |
| 3 | 1 | 11.1% | |
| 6 | 1 | 11.1% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 2379447 | 99.9% | |
| Common | 1211 | 0.1% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| t | 322053 | 13.5% | |
| h | 320851 | 13.5% | |
| n | 297007 | 12.5% | |
| a | 280112 | 11.8% | |
| e | 273446 | 11.5% | |
| l | 197963 | 8.3% | |
| o | 184128 | 7.7% | |
| C | 184114 | 7.7% | |
| r | 137925 | 5.8% | |
| L | 136723 | 5.7% | |
| i | 16239 | 0.7% | |
| V | 12633 | 0.5% | |
| y | 12633 | 0.5% | |
| c | 1216 | 0.1% | |
| A | 1202 | 0.1% | |
| f | 1202 | 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 1202 | 99.3% | |
| 2 | 2 | 0.2% | |
| 5 | 2 | 0.2% | |
| 8 | 1 | 0.1% | |
| 7 | 1 | 0.1% | |
| 4 | 1 | 0.1% | |
| 3 | 1 | 0.1% | |
| 6 | 1 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 2380658 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| t | 322053 | 13.5% | |
| h | 320851 | 13.5% | |
| n | 297007 | 12.5% | |
| a | 280112 | 11.8% | |
| e | 273446 | 11.5% | |
| l | 197963 | 8.3% | |
| o | 184128 | 7.7% | |
| C | 184114 | 7.7% | |
| r | 137925 | 5.8% | |
| L | 136723 | 5.7% | |
| i | 16239 | 0.7% | |
| V | 12633 | 0.5% | |
| y | 12633 | 0.5% | |
| c | 1216 | 0.1% | |
| A | 1202 | 0.1% | |
| f | 1202 | 0.1% | |
| (space) | 1202 | 0.1% | |
| 2 | 2 | < 0.1% | |
| 5 | 2 | < 0.1% | |
| 8 | 1 | < 0.1% | |
| 7 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% |
BODYTYPE
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 50 |
| Missing (%) | < 0.1% |
| Memory size | 3.6 MiB |
Bar chart of the most frequent values (counts are listed in the table below).
| Value | Count | Frequency (%) | |
| SUV | 184475 | 38.8% | |
| Truck | 144143 | 30.3% | |
| Sedan | 85608 | 18.0% | |
| Hatchback | 23796 | 5.0% | |
| Coupe | 16396 | 3.4% | |
| Cargo Van | 9773 | 2.1% | |
| Wagon | 5936 | 1.2% | |
| Convertible | 4036 | 0.8% | |
| Cab/Chassis | 1199 | 0.3% | |
| Minivan/Van | 243 | 0.1% | |
| RV | 17 | < 0.1% | |
| Cutaway Van | 1 | < 0.1% | |
| (Missing) | 50 | < 0.1% |
Frequencies of value counts
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 11 |
|---|---|
| Median length | 5 |
| Mean length | 4.575441532 |
| Min length | 2 |
Most occurring characters
| Value | Count | Frequency (%) | |
| S | 270083 | 12.4% | |
| V | 194509 | 8.9% | |
| c | 191735 | 8.8% | |
| U | 184475 | 8.5% | |
| k | 167939 | 7.7% | |
| a | 161619 | 7.4% | |
| u | 160540 | 7.4% | |
| r | 157952 | 7.3% | |
| T | 144143 | 6.6% | |
| e | 110076 | 5.1% | |
| n | 106183 | 4.9% | |
| d | 85608 | 3.9% | |
| o | 36141 | 1.7% | |
| C | 32604 | 1.5% | |
| b | 29031 | 1.3% | |
| t | 27833 | 1.3% | |
| h | 24995 | 1.1% | |
| H | 23796 | 1.1% | |
| p | 16396 | 0.8% | |
| g | 15709 | 0.7% | |
| (space) | 9774 | 0.4% | |
| W | 5936 | 0.3% | |
| i | 5721 | 0.3% | |
| v | 4279 | 0.2% | |
| l | 4036 | 0.2% | |
| Other values (6) | 5301 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 1309392 | 60.2% | |
| Uppercase Letter | 855806 | 39.3% | |
| Space Separator | 9774 | 0.4% | |
| Other Punctuation | 1442 | 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| S | 270083 | 31.6% | |
| V | 194509 | 22.7% | |
| U | 184475 | 21.6% | |
| T | 144143 | 16.8% | |
| C | 32604 | 3.8% | |
| H | 23796 | 2.8% | |
| W | 5936 | 0.7% | |
| M | 243 | < 0.1% | |
| R | 17 | < 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| c | 191735 | 14.6% | |
| k | 167939 | 12.8% | |
| a | 161619 | 12.3% | |
| u | 160540 | 12.3% | |
| r | 157952 | 12.1% | |
| e | 110076 | 8.4% | |
| n | 106183 | 8.1% | |
| d | 85608 | 6.5% | |
| o | 36141 | 2.8% | |
| b | 29031 | 2.2% | |
| t | 27833 | 2.1% | |
| h | 24995 | 1.9% | |
| p | 16396 | 1.3% | |
| g | 15709 | 1.2% | |
| i | 5721 | 0.4% | |
| v | 4279 | 0.3% | |
| l | 4036 | 0.3% | |
| s | 3597 | 0.3% | |
| w | 1 | < 0.1% | |
| y | 1 | < 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 9774 | 100.0% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| / | 1442 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 2165198 | 99.5% | |
| Common | 11216 | 0.5% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| S | 270083 | 12.5% | |
| V | 194509 | 9.0% | |
| c | 191735 | 8.9% | |
| U | 184475 | 8.5% | |
| k | 167939 | 7.8% | |
| a | 161619 | 7.5% | |
| u | 160540 | 7.4% | |
| r | 157952 | 7.3% | |
| T | 144143 | 6.7% | |
| e | 110076 | 5.1% | |
| n | 106183 | 4.9% | |
| d | 85608 | 4.0% | |
| o | 36141 | 1.7% | |
| C | 32604 | 1.5% | |
| b | 29031 | 1.3% | |
| t | 27833 | 1.3% | |
| h | 24995 | 1.2% | |
| H | 23796 | 1.1% | |
| p | 16396 | 0.8% | |
| g | 15709 | 0.7% | |
| W | 5936 | 0.3% | |
| i | 5721 | 0.3% | |
| v | 4279 | 0.2% | |
| l | 4036 | 0.2% | |
| s | 3597 | 0.2% | |
| Other values (4) | 262 | < 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 9774 | 87.1% | |
| / | 1442 | 12.9% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 2176414 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| S | 270083 | 12.4% | |
| V | 194509 | 8.9% | |
| c | 191735 | 8.8% | |
| U | 184475 | 8.5% | |
| k | 167939 | 7.7% | |
| a | 161619 | 7.4% | |
| u | 160540 | 7.4% | |
| r | 157952 | 7.3% | |
| T | 144143 | 6.6% | |
| e | 110076 | 5.1% | |
| n | 106183 | 4.9% | |
| d | 85608 | 3.9% | |
| o | 36141 | 1.7% | |
| C | 32604 | 1.5% | |
| b | 29031 | 1.3% | |
| t | 27833 | 1.3% | |
| h | 24995 | 1.1% | |
| H | 23796 | 1.1% | |
| p | 16396 | 0.8% | |
| g | 15709 | 0.7% | |
| (space) | 9774 | 0.4% | |
| W | 5936 | 0.3% | |
| i | 5721 | 0.3% | |
| v | 4279 | 0.2% | |
| l | 4036 | 0.2% | |
| Other values (6) | 5301 | 0.2% |
BODYCABSTYLE
Categorical
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 315340 |
| Missing (%) | 66.3% |
| Memory size | 3.6 MiB |
Bar chart of the most frequent values (counts are listed in the table below).
| Value | Count | Frequency (%) | |
| Crew Cab | 116521 | 24.5% | |
| Extended Cab | 21459 | 4.5% | |
| Cargo Van | 9343 | 2.0% | |
| Standard Cab | 7277 | 1.5% | |
| Wagon | 4184 | 0.9% | |
| Extended Cargo Van | 567 | 0.1% | |
| Passenger Van | 566 | 0.1% | |
| Extended Wagon | 414 | 0.1% | |
| Mega Cab | 2 | < 0.1% | |
| (Missing) | 315340 | 66.3% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 18 |
|---|---|
| Median length | 3 |
| Mean length | 4.943318204 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| n | 676037 | 28.8% | |
| a | 500705 | 21.3% | |
| C | 271690 | 11.6% | |
| e | 162535 | 6.9% | |
| (space) | 156716 | 6.7% | |
| b | 145259 | 6.2% | |
| r | 134274 | 5.7% | |
| w | 116521 | 5.0% | |
| d | 59434 | 2.5% | |
| t | 29717 | 1.3% | |
| E | 22440 | 1.0% | |
| x | 22440 | 1.0% | |
| g | 15076 | 0.6% | |
| o | 14508 | 0.6% | |
| V | 10476 | 0.4% | |
| S | 7277 | 0.3% | |
| W | 4598 | 0.2% | |
| s | 1132 | < 0.1% | |
| P | 566 | < 0.1% | |
| M | 2 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 1877638 | 79.9% | |
| Uppercase Letter | 317049 | 13.5% | |
| Space Separator | 156716 | 6.7% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| n | 676037 | 36.0% | |
| a | 500705 | 26.7% | |
| e | 162535 | 8.7% | |
| b | 145259 | 7.7% | |
| r | 134274 | 7.2% | |
| w | 116521 | 6.2% | |
| d | 59434 | 3.2% | |
| t | 29717 | 1.6% | |
| x | 22440 | 1.2% | |
| g | 15076 | 0.8% | |
| o | 14508 | 0.8% | |
| s | 1132 | 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| C | 271690 | 85.7% | |
| E | 22440 | 7.1% | |
| V | 10476 | 3.3% | |
| S | 7277 | 2.3% | |
| W | 4598 | 1.5% | |
| P | 566 | 0.2% | |
| M | 2 | < 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 156716 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 2194687 | 93.3% | |
| Common | 156716 | 6.7% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| n | 676037 | 30.8% | |
| a | 500705 | 22.8% | |
| C | 271690 | 12.4% | |
| e | 162535 | 7.4% | |
| b | 145259 | 6.6% | |
| r | 134274 | 6.1% | |
| w | 116521 | 5.3% | |
| d | 59434 | 2.7% | |
| t | 29717 | 1.4% | |
| E | 22440 | 1.0% | |
| x | 22440 | 1.0% | |
| g | 15076 | 0.7% | |
| o | 14508 | 0.7% | |
| V | 10476 | 0.5% | |
| S | 7277 | 0.3% | |
| W | 4598 | 0.2% | |
| s | 1132 | 0.1% | |
| P | 566 | < 0.1% | |
| M | 2 | < 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 156716 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 2351403 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| n | 676037 | 28.8% | |
| a | 500705 | 21.3% | |
| C | 271690 | 11.6% | |
| e | 162535 | 6.9% | |
| (space) | 156716 | 6.7% | |
| b | 145259 | 6.2% | |
| r | 134274 | 5.7% | |
| w | 116521 | 5.0% | |
| d | 59434 | 2.5% | |
| t | 29717 | 1.3% | |
| E | 22440 | 1.0% | |
| x | 22440 | 1.0% | |
| g | 15076 | 0.6% | |
| o | 14508 | 0.6% | |
| V | 10476 | 0.4% | |
| S | 7277 | 0.3% | |
| W | 4598 | 0.2% | |
| s | 1132 | < 0.1% | |
| P | 566 | < 0.1% | |
| M | 2 | < 0.1% |
ODOMETER
Real number (ℝ≥0)
| Distinct | 136362 |
|---|---|
| Distinct (%) | 29.1% |
| Missing | 6952 |
| Missing (%) | 1.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 60952.48376 |
|---|---|
| Minimum | 1 |
| Maximum | 394967 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 14004 |
| Q1 | 31490 |
| median | 51890 |
| Q3 | 83964 |
| 95-th percentile | 133539 |
| Maximum | 394967 |
| Range | 394966 |
| Interquartile range (IQR) | 52474 |
Descriptive statistics
| Standard deviation | 38840.94138 |
|---|---|
| Coefficient of variation (CV) | 0.6372331197 |
| Kurtosis | 1.958794776 |
| Mean | 60952.48376 |
| Median Absolute Deviation (MAD) | 24110 |
| Skewness | 1.135226545 |
| Sum | 2.856970914e+10 |
| Variance | 1508618728 |
| Monotonicity | Not monotonic |
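Several entries in the statistics tables above are derived from the others. The following is a minimal sketch, assuming the figures are copied correctly from the tables and that the number of observations (475673) comes from the dataset overview; the variable names are illustrative only.

```python
from math import isclose

# Figures copied from the tables above; n_rows is taken from the dataset overview.
n_rows, n_missing = 475673, 6952
minimum, maximum = 1, 394967
q1, q3 = 31490, 83964
mean, std = 60952.48376, 38840.94138

iqr = q3 - q1                        # interquartile range
value_range = maximum - minimum      # range
cv = std / mean                      # coefficient of variation
variance = std ** 2                  # variance is the squared standard deviation
total = mean * (n_rows - n_missing)  # sum over the non-missing observations

# Compare against the reported values, allowing for rounding in the report.
print(iqr == 52474, value_range == 394966)
print(isclose(cv, 0.6372331197, rel_tol=1e-6))
print(isclose(variance, 1508618728, rel_tol=1e-6))
print(isclose(total, 2.856970914e10, rel_tol=1e-6))
```

The sum is reproduced only when missing values are excluded from the count, which suggests the mean is computed over non-missing observations.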
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 5 | 696 | 0.1% | |
| 10 | 513 | 0.1% | |
| 1 | 307 | 0.1% | |
| 13 | 305 | 0.1% | |
| 2 | 163 | < 0.1% | |
| 6 | 141 | < 0.1% | |
| 3 | 140 | < 0.1% | |
| 7 | 98 | < 0.1% | |
| 60000 | 92 | < 0.1% | |
| 80000 | 82 | < 0.1% | |
| 38000 | 81 | < 0.1% | |
| 65000 | 79 | < 0.1% | |
| 42000 | 79 | < 0.1% | |
| 30000 | 78 | < 0.1% | |
| 8 | 76 | < 0.1% | |
| 55000 | 74 | < 0.1% | |
| 98000 | 73 | < 0.1% | |
| 35000 | 73 | < 0.1% | |
| 61000 | 72 | < 0.1% | |
| 54000 | 71 | < 0.1% | |
| 4 | 69 | < 0.1% | |
| 56000 | 68 | < 0.1% | |
| 20000 | 67 | < 0.1% | |
| 25000 | 67 | < 0.1% | |
| 72000 | 67 | < 0.1% | |
| Other values (136337) | 465090 | 97.8% | |
| (Missing) | 6952 | 1.5% |
| Value | Count | Frequency (%) | |
| 1 | 307 | 0.1% | |
| 2 | 163 | < 0.1% | |
| 3 | 140 | < 0.1% | |
| 4 | 69 | < 0.1% | |
| 5 | 696 | 0.1% | |
| 6 | 141 | < 0.1% | |
| 7 | 98 | < 0.1% | |
| 8 | 76 | < 0.1% | |
| 9 | 51 | < 0.1% | |
| 10 | 513 | 0.1% |
| Value | Count | Frequency (%) | |
| 394967 | 1 | < 0.1% | |
| 391398 | 1 | < 0.1% | |
| 391202 | 1 | < 0.1% | |
| 390700 | 1 | < 0.1% | |
| 389281 | 1 | < 0.1% | |
| 386620 | 1 | < 0.1% | |
| 386078 | 1 | < 0.1% | |
| 385612 | 1 | < 0.1% | |
| 385314 | 1 | < 0.1% | |
| 384853 | 1 | < 0.1% |
LISTPRICE
Real number (ℝ≥0)
| Distinct | 38717 |
|---|---|
| Distinct (%) | 9.0% |
| Missing | 44089 |
| Missing (%) | 9.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22849.67779 |
|---|---|
| Minimum | 781 |
| Maximum | 1690400 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.6 MiB |
Quantile statistics
| Minimum | 781 |
|---|---|
| 5-th percentile | 8995 |
| Q1 | 14495 |
| median | 19943 |
| Q3 | 29700 |
| 95-th percentile | 43995 |
| Maximum | 1690400 |
| Range | 1689619 |
| Interquartile range (IQR) | 15205 |
Descriptive statistics
| Standard deviation | 12258.94317 |
|---|---|
| Coefficient of variation (CV) | 0.5365039843 |
| Kurtosis | 1571.421816 |
| Mean | 22849.67779 |
| Median Absolute Deviation (MAD) | 6949 |
| Skewness | 15.05906768 |
| Sum | 9861555338 |
| Variance | 150281687.7 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 14995 | 2632 | 0.6% | |
| 15995 | 2612 | 0.5% | |
| 16995 | 2509 | 0.5% | |
| 13995 | 2484 | 0.5% | |
| 17995 | 2404 | 0.5% | |
| 12995 | 2361 | 0.5% | |
| 18995 | 2225 | 0.5% | |
| 19995 | 2223 | 0.5% | |
| 11995 | 2147 | 0.5% | |
| 10995 | 2076 | 0.4% | |
| 9995 | 2050 | 0.4% | |
| 15998 | 1672 | 0.4% | |
| 8995 | 1639 | 0.3% | |
| 14998 | 1549 | 0.3% | |
| 21995 | 1543 | 0.3% | |
| 16998 | 1483 | 0.3% | |
| 29995 | 1464 | 0.3% | |
| 24995 | 1432 | 0.3% | |
| 22995 | 1364 | 0.3% | |
| 23995 | 1331 | 0.3% | |
| 20995 | 1247 | 0.3% | |
| 7995 | 1232 | 0.3% | |
| 25995 | 1193 | 0.3% | |
| 17998 | 1187 | 0.2% | |
| 26995 | 1175 | 0.2% | |
| Other values (38692) | 386350 | 81.2% | |
| (Missing) | 44089 | 9.3% |
| Value | Count | Frequency (%) | |
| 781 | 1 | < 0.1% | |
| 971 | 1 | < 0.1% | |
| 1100 | 2 | < 0.1% | |
| 1200 | 1 | < 0.1% | |
| 1500 | 1 | < 0.1% | |
| 1700 | 1 | < 0.1% | |
| 1800 | 2 | < 0.1% | |
| 1900 | 2 | < 0.1% | |
| 2000 | 1 | < 0.1% | |
| 2100 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1690400 | 1 | < 0.1% | |
| 1400000 | 1 | < 0.1% | |
| 1095000 | 1 | < 0.1% | |
| 999800 | 2 | < 0.1% | |
| 978900 | 1 | < 0.1% | |
| 219000 | 1 | < 0.1% | |
| 213576 | 1 | < 0.1% | |
| 130020 | 1 | < 0.1% | |
| 130000 | 1 | < 0.1% | |
| 129360 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
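The calculation described above can be sketched in plain Python. This is an illustrative sketch only, not the code the report uses; the function name `pearson_r` and the toy data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sample covariance and sample standard deviations (both use n - 1).
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov_xy / (std_x * std_y)


# Hypothetical toy data, not taken from the profiled dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

r_original = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10.0 * a - 7.0 for a in x], [0.5 * b + 100.0 for b in y])

print(r_original, r_rescaled)
```

Both calls give the same coefficient up to floating-point rounding, which illustrates the invariance under location and scale changes.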
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
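As a rough sketch of the description above, ρ can be computed by ranking both variables and applying the Pearson formula to the ranks. The helper names and the toy data are hypothetical, and assigning tied values their average rank is an assumption (a common convention), not something the report states.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance over the product of standard deviations (n - 1 cancels out)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical monotonic but nonlinear relation: y = x cubed.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this toy data the ranks of x and y coincide, so ρ reaches its maximum while Pearson's r stays below 1.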
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
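The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows, with a hypothetical function name and toy data; library implementations often use a tie-adjusted variant instead, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # Pairs tied in x or y count as neither concordant nor discordant.
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical toy data: 10 pairs, of which 8 are concordant and 2 discordant.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

tau = kendall_tau_a(x, y)
print(tau)
```

The quadratic loop over all pairs is fine for illustration but would be slow on a dataset of the size profiled here.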
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. The Phik project provides extensive documentation.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | POSTALCODE | STATE | MODELYEAR | MAKE | MODEL | TRIM | ENGINEDISPLACEMENT | ENGINECYLINDERCOUNT | TRANSMISSIONTYPE | DRIVETRAINTYPE | EXTERIORBASECOLOR | INTERIORMATERIAL | BODYTYPE | BODYCABSTYLE | ODOMETER | LISTPRICE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 30655 | GA | 2016 | Ford | Aerostar | NaN | 3.7 | 6.0 | Automatic | RWD | White | Vinyl | Cutaway Van | NaN | 87970.0 | 29991.0 |
| 1 | 46124 | IN | 2016 | Ford | Aerostar | NaN | 3.0 | 6.0 | Automatic | RWD | White | NaN | Minivan/Van | NaN | 109470.0 | 25990.0 |
| 2 | 1453 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Silver | Leather | Hatchback | NaN | 56000.0 | 8900.0 |
| 3 | 1453 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Green | Leather | Hatchback | NaN | 81778.0 | 12999.0 |
| 4 | 1581 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Gray | Leather | Hatchback | NaN | 79566.0 | 10998.0 |
| 5 | 1720 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | White | Leather | Hatchback | NaN | 36077.0 | 10988.0 |
| 6 | 1906 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Silver | Leather | Hatchback | NaN | 102319.0 | 8795.0 |
| 7 | 2148 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Black | Leather | Hatchback | NaN | 66892.0 | NaN |
| 8 | 2346 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Silver | NaN | Hatchback | NaN | 124630.0 | 5898.0 |
| 9 | 2420 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Silver | Leather | Hatchback | Wagon | 51084.0 | 8338.0 |
Last rows
| | POSTALCODE | STATE | MODELYEAR | MAKE | MODEL | TRIM | ENGINEDISPLACEMENT | ENGINECYLINDERCOUNT | TRANSMISSIONTYPE | DRIVETRAINTYPE | EXTERIORBASECOLOR | INTERIORMATERIAL | BODYTYPE | BODYCABSTYLE | ODOMETER | LISTPRICE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 475663 | 60165 | IL | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | White | NaN | SUV | NaN | 50301.0 | 23795.0 |
| 475664 | 60622 | IL | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | Black | NaN | SUV | NaN | 95779.0 | 14995.0 |
| 475665 | 60622 | IL | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | White | NaN | SUV | NaN | 46892.0 | 24795.0 |
| 475666 | 60647 | IL | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | NaN | NaN | SUV | NaN | 25632.0 | 25500.0 |
| 475667 | 84003 | UT | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | Black | Cloth | SUV | NaN | 35902.0 | 19889.0 |
| 475668 | 84003 | UT | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | Black | Cloth | SUV | NaN | 19220.0 | 21727.0 |
| 475669 | 97266 | OR | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | Blue | NaN | SUV | NaN | 43167.0 | 16791.0 |
| 475670 | 97266 | OR | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | White | NaN | SUV | NaN | 26559.0 | 16991.0 |
| 475671 | 97266 | OR | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | White | NaN | SUV | NaN | 22498.0 | 17491.0 |
| 475672 | 97266 | OR | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | Black | NaN | SUV | NaN | 4601.0 | 17791.0 |
Most frequent
| | POSTALCODE | STATE | MODELYEAR | MAKE | MODEL | TRIM | ENGINEDISPLACEMENT | ENGINECYLINDERCOUNT | TRANSMISSIONTYPE | DRIVETRAINTYPE | EXTERIORBASECOLOR | INTERIORMATERIAL | BODYTYPE | BODYCABSTYLE | ODOMETER | LISTPRICE | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 28677 | NC | 2018 | Ford | Transit-250 | Base | 3.7 | 6.0 | Automatic | RWD | White | Vinyl | Cargo Van | Cargo Van | 5.0 | 34182.0 | 4 |
| 2 | 46526 | IN | 2017 | Ford | F-650SD | Base | 6.0 | 10.0 | Automatic | RWD | White | Cloth | Truck | Standard Cab | 2.0 | 107999.0 | 3 |
| 1 | 40601 | KY | 2015 | Ford | F-250SD | XL | 6.2 | 8.0 | Automatic | 4WD | White | Vinyl | Truck | Standard Cab | 163000.0 | 24900.0 | 2 |
| 3 | 48118 | MI | 2017 | Ford | Transit Connect | XL | 2.5 | 4.0 | Automatic | FWD | White | Vinyl | Cargo Van | Cargo Van | 69488.0 | 16455.0 | 2 |
| 4 | 55374 | MN | 2016 | Ford | F-150 | Lariat | 2.7 | 6.0 | Automatic | 4WD | Black | Leather | Truck | Extended Cab | 84506.0 | 28900.0 | 2 |
| 5 | 60014 | IL | 2017 | Ford | F-150 | XLT | 5.0 | 8.0 | Automatic | 4WD | Brown | Cloth | Truck | Crew Cab | 60842.0 | 31995.0 | 2 |
| 6 | 64601 | MO | 2016 | Ford | F-150 | XLT | 5.0 | 8.0 | Automatic | 4WD | Black | Cloth | Truck | Crew Cab | 80565.0 | 30395.0 | 2 |
| 7 | 68008 | NE | 2017 | Ford | F-150 | Lariat | 5.0 | 8.0 | Automatic | 4WD | White | Leather | Truck | Crew Cab | 28213.0 | 43823.0 | 2 |
| 8 | 73762 | OK | 2018 | Ford | F-550SD | XL | 6.0 | 10.0 | Automatic | 4WD | White | Vinyl | Truck | Extended Cab | 5.0 | 51995.0 | 2 |
| 9 | 85204 | AZ | 2018 | Ford | Transit-250 | Base | 3.7 | 6.0 | Automatic | RWD | White | Vinyl | Cargo Van | Cargo Van | 5.0 | 34795.0 | 2 |